intermittent observation
Online Learning of Deceptive Policies under Intermittent Observation
Puthumanaillam, Gokul, Padmanabhan, Ram, Fuentes, Jose, Cruz, Nicole, Padrao, Paulo, Hernandez, Ruben, Jiang, Hao, Schafer, William, Bobadilla, Leonardo, Ornik, Melkior
In supervisory control settings, autonomous systems are not monitored continuously. Instead, monitoring often occurs at sporadic intervals within known bounds. We study the problem of deception, where an agent pursues a private objective while remaining plausibly compliant with a supervisor's reference policy when observations occur. Motivated by the behavior of real human supervisors, we situate the problem within Theory of Mind (ToM): the representation of what an observer believes and expects to see. We show that Theory of Mind can be repurposed to steer online reinforcement learning (RL) toward such deceptive behavior. We model the supervisor's expectations and distill from them a single, calibrated scalar -- the expected evidence of deviation if an observation were to happen now. This scalar combines how dissimilar the reference and current action distributions appear with the agent's belief that an observation is imminent. Injected as a state-dependent weight into a KL-regularized policy improvement step within an online RL loop, this scalar informs a closed-form update that smoothly trades off self-interest and compliance, thus sidestepping hand-crafted or heuristic policies. In real-world, real-time hardware experiments on marine (ASV) and aerial (UAV) navigation, our ToM-guided RL runs online and achieves high return and success, with observed-trace evidence calibrated to the supervisor's expectations.
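The closed-form update mentioned in the abstract can be illustrated with the standard solution of a KL-regularized objective for a discrete action set: maximizing E_pi[Q] - w * KL(pi || pi_ref) gives pi(a) proportional to pi_ref(a) * exp(Q(a) / w). The sketch below is not the authors' implementation. The function names, the `base` and `scale` constants, and the particular way the weight combines the observation belief with the divergence are all assumptions made for illustration; the abstract only says that the two quantities are combined into one scalar.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def evidence_weight(current, reference, p_observe, base=0.1, scale=5.0):
    """Hypothetical state-dependent weight (an assumption, not the paper's formula).

    It grows with the belief that an observation is imminent (p_observe) and
    with how far the current action distribution is from the reference.
    """
    return base + scale * p_observe * kl_divergence(current, reference)

def kl_regularized_update(q_values, reference, weight):
    """Closed-form maximizer of E_pi[Q] - weight * KL(pi || reference).

    Assumes the reference policy has full support (no zero probabilities).
    """
    logits = [math.log(r) + q / weight for q, r in zip(q_values, reference)]
    shift = max(logits)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(l - shift) for l in logits]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Illustrative numbers only: three actions, the supervisor prefers action 0,
# the agent's private objective prefers action 2.
reference = [0.7, 0.2, 0.1]
current = [0.2, 0.3, 0.5]
q_values = [0.0, 0.5, 1.0]

unwatched = kl_regularized_update(
    q_values, reference, evidence_weight(current, reference, p_observe=0.0))
watched = kl_regularized_update(
    q_values, reference, evidence_weight(current, reference, p_observe=0.9))
```

With a small weight the updated policy concentrates on the action with the highest private value; with a large weight it stays close to the reference policy, which is the trade-off between self-interest and compliance that the abstract describes.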
Two-Channel Extended Kalman Filtering with Intermittent Measurements
Maer, Vicu-Mihalis, Lendek, Zsofia, Pirje, Stefan, Tolic, Domagoj, Djuras, Antun, Prkacin, Vicko, Palunko, Ivana, Busoniu, Lucian
We consider two nonlinear state estimation problems in a setting where an extended Kalman filter receives measurements from two sets of sensors via two channels (2C). In the stochastic-2C problem, the channels drop measurements stochastically, whereas in 2C scheduling, the estimator chooses when to read each channel. In the first problem, we generalize linear-case 2C analysis to obtain -- for a given pair of channel arrival rates -- boundedness conditions for the trace of the error covariance, as well as a worst-case upper bound. For scheduling, an optimization problem is solved to find arrival rates that balance low channel usage with low trace bounds, and channels are read deterministically with the expected periods corresponding to these arrival rates. We validate both solutions in simulations for linear and nonlinear dynamics, as well as in a real experiment with an underwater robot whose position is intermittently detected in a UAV camera image.
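To make the stochastic two-channel setting concrete, the following is a minimal sketch for a scalar linear system, which is the linear special case rather than the extended Kalman filter studied in the paper. Each channel delivers its measurement with its own arrival probability, and the filter applies a measurement update only for the channels that arrived. The system constants (`a`, `q`), the noise variances, and all function names are hypothetical choices for illustration.

```python
import random

def predict(x_est, p, a, q):
    """Time update for x[k+1] = a * x[k] + w, with Var(w) = q."""
    return a * x_est, a * a * p + q

def update(x_est, p, y, r):
    """Measurement update for y = x + v, with Var(v) = r."""
    gain = p / (p + r)
    return x_est + gain * (y - x_est), (1.0 - gain) * p

def run_filter(steps, rates, noise_vars, a=1.2, q=0.1, seed=0):
    """Simulate the filter and return the posterior error variance at each step.

    rates[i] is the arrival probability of channel i and noise_vars[i] is its
    measurement noise variance. In this scalar case the error variance plays
    the role of the trace of the error covariance.
    """
    rng = random.Random(seed)
    x_true, x_est, p = 0.0, 0.0, 1.0
    variances = []
    for _ in range(steps):
        x_true = a * x_true + rng.gauss(0.0, q ** 0.5)
        x_est, p = predict(x_est, p, a, q)
        for rate, r in zip(rates, noise_vars):
            if rng.random() < rate:  # this channel's measurement arrived
                y = x_true + rng.gauss(0.0, r ** 0.5)
                x_est, p = update(x_est, p, y, r)
        variances.append(p)
    return variances

noise_vars = (0.5, 2.0)
never = run_filter(50, rates=(0.0, 0.0), noise_vars=noise_vars)
always = run_filter(50, rates=(1.0, 1.0), noise_vars=noise_vars)
mixed = run_filter(50, rates=(0.6, 0.3), noise_vars=noise_vars)
```

Because the example system is unstable (`a` greater than 1), the variance grows without bound when no measurements arrive and stays bounded when both channels always deliver. Intermediate arrival rates fall between these extremes, which is the regime the boundedness conditions in the abstract address.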
- Europe > Romania > Nord-Vest Development Region > Cluj County > Cluj-Napoca (0.04)
- Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Tracking and Following a Suspended Moving Object using Camera-Based Vision System
Ambrosino, Michele, Mahmalji, Manar, Rosselló, Nicolás Bono, Garone, Emanuele
When robots are able to see and respond to their surroundings, a whole new world of possibilities opens up. To bring these possibilities to life, the robotics industry is increasingly adopting camera-based vision systems, especially when a robotic system needs to interact with a dynamic environment or moving target. However, this kind of vision system is known to suffer from low data transmission rates, packet loss during communication, and noisy measurements. These problems can degrade the control performance and the quality of the robot-environment interaction. To improve the quality of visual information, in this paper, we propose to model the dynamics of the motion of a target object and use this model to implement an Extended Kalman Filter based on intermittent observations from the vision system. The effectiveness of the proposed approach was tested through experiments with a robotic arm, a camera device in an eye-to-hand configuration, and an oscillating suspended block as a target to follow.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Asia > Middle East > Jordan (0.04)
Modeling Time-Series and Spatial Data for Recommendations and Other Applications
With the research directions described in this thesis, we seek to address the critical challenges in designing recommender systems that can understand the dynamics of continuous-time event sequences (CTES). We follow a ground-up approach, i.e., first, we address the problems that may arise due to the poor quality of CTES data being fed into a recommender system. Later, we handle the task of designing accurate recommender systems. To improve the quality of the CTES data, we address a fundamental problem of overcoming missing events in temporal sequences. Moreover, to provide accurate sequence modeling frameworks, we design solutions for point-of-interest (POI) recommendation, i.e., models that can handle users' spatial mobility data across POI check-ins and recommend candidate locations for the next check-in. Lastly, we highlight that the capabilities of the proposed models can have applications beyond recommender systems, and we extend their abilities to design solutions for large-scale CTES retrieval and human activity prediction. A significant part of this thesis uses the idea of modeling the underlying distribution of CTES via neural marked temporal point processes (MTPP). Traditional MTPP models are stochastic processes that utilize a fixed formulation to capture the generative mechanism of a sequence of discrete events localized in continuous time. In contrast, neural MTPPs combine the underlying ideas from the point process literature with modern deep learning architectures. The ability of deep-learning models to act as accurate function approximators has led to a significant gain in the predictive prowess of neural MTPP models. In this thesis, we utilize and present several neural network-based enhancements for the current MTPP frameworks for the aforementioned real-world applications.
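As an illustration of what a "fixed formulation" means for a traditional temporal point process, the sketch below uses a Hawkes process with an exponential kernel, a classic parametric model in which each past event temporarily raises the rate of future events. It is a generic textbook example and not a model taken from the thesis; marks are omitted for brevity, and the function and parameter names (`mu`, `alpha`, `beta`) are illustrative choices.

```python
import math

def hawkes_intensity(t, history, mu, alpha, beta):
    """Conditional intensity mu + alpha * sum(exp(-beta * (t - s))) over past events s < t."""
    return mu + alpha * sum(math.exp(-beta * (t - s)) for s in history if s < t)

def hawkes_log_likelihood(events, horizon, mu, alpha, beta):
    """Log-likelihood of event times observed on [0, horizon].

    Equals the sum of log-intensities at the event times minus the integral
    of the intensity over the window (the compensator). Requires beta > 0
    and all event times within [0, horizon].
    """
    log_term = sum(
        math.log(hawkes_intensity(t, events, mu, alpha, beta)) for t in events)
    compensator = mu * horizon + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - s)) for s in events)
    return log_term - compensator

# Illustrative event times (arbitrary units), not real data.
events = [1.0, 2.0, 3.0]
ll_poisson = hawkes_log_likelihood(events, horizon=4.0, mu=0.5, alpha=0.0, beta=1.0)
ll_hawkes = hawkes_log_likelihood(events, horizon=4.0, mu=0.5, alpha=0.8, beta=1.0)
```

Setting `alpha` to zero recovers a homogeneous Poisson process. A neural MTPP replaces this hand-specified intensity with one parameterized by a neural network, while the likelihood keeps the same overall structure of log-intensities minus a compensator.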
- Europe > United Kingdom (0.13)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Virginia (0.04)
- Research Report > Promising Solution (1.00)
- Overview (1.00)
- Research Report > New Finding (0.92)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)